Abstract

Two mathematical programming approaches are presented for the assembly of ability
tests from item pools calibrated under a multidimensional IRT model. Item selection
is based on Fisher’s information matrix. Several criteria can be used to optimize this
matrix. In this paper the A-criterion and the D-criterion are applied. In a mathematical
programming approach, both criteria provide good results for the two-dimensional case.
Empirical examples for a two-dimensional mathematics item pool illustrate the methods.
Recommendations are provided about when to apply each approach.

Keywords: Greedy heuristic, linear approximation, mathematical programming,
multidimensional IRT, optimal test assembly, test design.

Introduction



In educational measurement, item response theory (IRT; Birnbaum, 1968) is
generally used as the psychometric theory that governs the test assembly process. In this
process, three steps can be distinguished. First, an IRT model is chosen and the items in
the item bank are calibrated. Because many different tests can be assembled from one
item bank, the second step consists of specifying the properties of the desired test,
for example, the test length, the desired amount of information in the test, or the
administration time needed for the test. The third step of the process is to choose an
algorithm that selects items from the item bank such that the test specifications are met.
A mathematical programming approach is often used for this step.

The idea of using a mathematical programming approach was suggested by Yen
(1983) and Theunissen (1985). Since then, several papers have proposed various LP
algorithms and heuristics to solve test assembly problems. Recently, van der Linden
(1996), Segall (1996), and Luecht (1996) addressed the subject of assembling tests
measuring multiple abilities, that is, from an item bank calibrated using a
multidimensional IRT (MIRT) model.

Measuring multiple abilities using a MIRT model is not always seen as a practical
option (Wainer et al., 1990). However, the main advantage of MIRT is increased
measurement efficiency (Segall, 1996). When the dimensions measured in the test
have non-zero correlations, items with a content classification in one dimension provide
information about the other dimensions. MIRT models have been a subject of research
for many years (Ackerman, 1994; Bock, Gibbons, and Muraki, 1988; McKinley and
Reckase, 1983; Reckase, 1985). Much research has been carried out in order to decide
whether correlations are low enough to represent significant dimensions (McDonald,
1981; Reckase, 1979; Stout, 1987). Once the number of significant dimensions has been
confirmed, the item parameters can be estimated. Programs such as NOHARM (Fraser and
McDonald, 1988) and TESTFACT (Wilson, Wood, and Gibbons, 1987) are often used in
this step. Based on these parameters, items can be selected from an item bank to assemble
tests that meet the specifications.

In van der Linden (1996), an algorithm to solve the problem of assembling
constrained tests that measure multiple traits was provided. However, intervention by
the test assembler was needed to find the best solution. Therefore, the purpose of the
present study was to find a heuristic that provides good solutions to the problem of
assembling tests measuring multiple traits in a fully automated fashion. Two algorithms
were developed. The first algorithm is based on a linear approximation of the objective
function. The second is an adjustment of Luecht’s (1996) algorithm for adaptive testing
to the problem of assembling constrained multidimensional paper-and-pencil (P&P) tests.

In the remainder of the paper, the MIRT model used for calibrating the items is
described. Then a multidimensional version of a maximin model (van der Linden &
Boekkooi-Timminga, 1989) is presented. Subsequently, two algorithms developed to
assemble optimal tests are introduced. Both algorithms are compared by applying them
to empirical examples. Finally, both algorithms are discussed and recommendations for
their use are provided.

A Linear Logistic MIRT Model

The MIRT Model

The model considered in this paper is a generalization of the two-parameter logistic
model (Lord, 1980) to the multidimensional case (Reckase, 1985). It can be formulated
in the following manner:

$$P_i(\theta_j) = P(U_{ij} = 1 \mid a_i, d_i, \theta_j) \tag{1}$$

$$\phantom{P_i(\theta_j)} = \frac{\exp(a_i \cdot \theta_j + d_i)}{1 + \exp(a_i \cdot \theta_j + d_i)}, \tag{2}$$

where $P_i(\theta_j)$ is the probability that a person $j = 1, \ldots, J$ with ability vector
$\theta_j$ gives a correct response $U_{ij}$ to an item $i = 1, \ldots, I$, $a_i$ is the vector of
discrimination parameters of item $i$ along the abilities $\theta_{j1}, \ldots, \theta_{jm}$, $m$ is the
dimensionality of the ability space, and $d_i$ is the parameter representing the difficulty
of the item. In this paper, the item parameters are supposed to be known and the model
is used to estimate the ability vectors $\theta_j$ from realizations $U_{ij} = u_{ij}$ of the response
variables for $i = 1, \ldots, I$ and $j = 1, \ldots, J$.
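The response function in Equations 1 and 2 can be evaluated directly. The following is a minimal sketch; the function name `prob_correct` and the item parameters used in the example are hypothetical and do not come from the item pool in this paper.

```python
import math

def prob_correct(a, d, theta):
    """Probability of a correct response under the multidimensional 2PL model.

    a     : discrimination parameters of the item, one per ability dimension
    d     : difficulty-related parameter of the item
    theta : ability vector of the person
    """
    logit = sum(ak * tk for ak, tk in zip(a, theta)) + d
    # exp(l) / (1 + exp(l)) written in the equivalent form 1 / (1 + exp(-l))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical two-dimensional item
p_low = prob_correct(a=(1.2, 0.4), d=-0.5, theta=(0.0, 0.0))
p_high = prob_correct(a=(1.2, 0.4), d=-0.5, theta=(1.0, 1.0))
```

Because both discrimination parameters are positive in this example, the probability increases when either ability increases.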



Fisher’s Information

In the multiparameter case, Fisher’s information is a matrix instead of a scalar
(Lehmann, 1983; Segall, 1996). The (asymptotic) variances of the MLEs of the
ability parameters $\theta_1, \ldots, \theta_m$ are given by the diagonal elements of the inverse of
Fisher’s information matrix. For notational simplicity we consider the case of $m = 2$. Then
Fisher’s information matrix and the variance-covariance matrix are of the following forms:

$$I(\theta) = \begin{pmatrix} \sum_{i=1}^{n} a_{1i}^2 P_i Q_i & \sum_{i=1}^{n} a_{1i} a_{2i} P_i Q_i \\ \sum_{i=1}^{n} a_{1i} a_{2i} P_i Q_i & \sum_{i=1}^{n} a_{2i}^2 P_i Q_i \end{pmatrix} \tag{3}$$

and

$$V(\hat{\theta} \mid \theta) = I(\theta)^{-1} = \frac{1}{\det I(\theta)} \begin{pmatrix} \sum_{i=1}^{n} a_{2i}^2 P_i Q_i & -\sum_{i=1}^{n} a_{1i} a_{2i} P_i Q_i \\ -\sum_{i=1}^{n} a_{1i} a_{2i} P_i Q_i & \sum_{i=1}^{n} a_{1i}^2 P_i Q_i \end{pmatrix}, \tag{4}$$

where $Q_i = 1 - P_i$.

In order to optimize measurement precision, either Fisher’s information matrix has
to be maximized or the variance-covariance matrix has to be minimized.

A Multidimensional Maximin Model

A maximin model was used to formulate the test assembly process. The model
consists of an objective function that has to be optimized over the set of possible tests
meeting the specifications. The set is typically delineated by a number of mathematical
constraints. Before the model is formulated, decision variables are introduced.

In the matrices above, sums are taken over the $n$ items in the test. In the
test assembly process, it is not known beforehand which items will be chosen for the test.
To guarantee that the matrices are calculated for the items that are in the test, a decision
variable $x_i$ has to be introduced for every item, where $x_i = 1$ if item $i$ is in the
test and $x_i = 0$ if item $i$ is not in the test, for $i = 1, \ldots, I$. Here $I$ is the number of
items in the item bank.
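All test specifications also have to be expressed in terms of the $x_i$. For example, the
variance functions, that is, the diagonal elements of the variance-covariance matrix, can
be rewritten such that:

$$\mathrm{Var}(\hat{\theta}_1 \mid \theta) = \frac{\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2} \tag{5}$$

and

$$\mathrm{Var}(\hat{\theta}_2 \mid \theta) = \frac{\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2}, \tag{6}$$

where the sums are taken over the items in the item bank.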
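The two variance functions can be computed from three weighted sums over the item bank. The sketch below assumes a two-dimensional model and represents each item as a hypothetical tuple `(a1, a2, d)`; the function names and the three example items are illustrative only.

```python
import math

def info_sums(items, x, theta):
    """Return the three sums that make up Fisher's information matrix.

    items : list of (a1, a2, d) tuples for the whole item bank
    x     : list of 0/1 decision variables, one per item
    theta : ability point (theta1, theta2)
    """
    s11 = s22 = s12 = 0.0
    for (a1, a2, d), xi in zip(items, x):
        p = 1.0 / (1.0 + math.exp(-(a1 * theta[0] + a2 * theta[1] + d)))
        pq = p * (1.0 - p) * xi  # items with x_i = 0 contribute nothing
        s11 += a1 * a1 * pq
        s22 += a2 * a2 * pq
        s12 += a1 * a2 * pq
    return s11, s22, s12

def variances(items, x, theta):
    """Asymptotic variances of the two ability estimates (diagonal of the inverse)."""
    s11, s22, s12 = info_sums(items, x, theta)
    det = s11 * s22 - s12 * s12
    return s22 / det, s11 / det

# Hypothetical bank of three items; the first two are selected
items = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.8, 0.6, -0.3)]
var1, var2 = variances(items, [1, 1, 0], (0.0, 0.0))
```

Note that a test whose selected items all have proportional discrimination vectors gives a singular information matrix, so the determinant should be checked before dividing in any practical use.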



Objective Function

As described above, for precise measurement of multiple traits, either Fisher’s
information matrix should be maximized or the variance-covariance matrix should be
minimized. From optimum design theory, several criteria for optimality of matrices are
known (see, for example, van der Linden, 1994). In this paper, A-optimality and
D-optimality are considered.

The A-optimality criterion minimizes the trace of the variance-covariance matrix, that
is, the sum of the variance functions. By assigning weights to the variance functions,
the relative importance of the different abilities can be specified. In this way, different
cases of multidimensional test assembly can be treated. The optimality criterion for the
two-dimensional case can be formulated in the following manner:

$$\min \; w_1 \mathrm{Var}(\hat{\theta}_1 \mid \theta) + w_2 \mathrm{Var}(\hat{\theta}_2 \mid \theta). \tag{7}$$

When traits are considered equally important, both weights are set equal to each other. A
different case occurs when items are sensitive to multiple abilities but the test is developed
to measure only one intended ability; then the weights of the other abilities can be set
equal to zero. These and other cases are described in van der Linden (1996). Besides
these cases, weights can also be used in a different manner. When the magnitudes of both
variance functions differ, weights can be used to rescale both functions in the criterion
such that the larger term no longer dominates the smaller one in the optimization process.

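The function defined in Equation 7 is not only a function of $x_i$ but also a continuous
function of the variables $(\theta_1, \theta_2)$. However, in the test assembly process it suffices to
optimize this objective function for a grid of points, instead of for the entire $\theta$-region.
For example, Theunissen (1985) reduced the problem of maximizing the information
function over the $\theta$-region to the problem of maximizing the information function at
certain $\theta$-points. This technique can also be applied to a multidimensional $\theta$-space. Let
the two-dimensional grid be defined by $(s, t)$, where $s = 1, \ldots, S$ and $t = 1, \ldots, T$. The
resulting objective function is:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; w_1 \mathrm{Var}(\hat{\theta}_1 \mid \theta_{st}) + w_2 \mathrm{Var}(\hat{\theta}_2 \mid \theta_{st}). \tag{8}$$

This is a complicated objective function. Substituting both variance functions by
Equation 5 and Equation 6 would result in:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; w_1 \frac{\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2} + w_2 \frac{\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2}, \tag{9}$$

where $P_i$ and $Q_i$ are evaluated at $\theta_{st}$.

It should be noted that the denominators of both terms are equal. So, the objective function
can be rewritten as:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \frac{w_1 \left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) + w_2 \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2}. \tag{10}$$

The second criterion is D-optimality. This criterion maximizes the determinant
of Fisher’s information matrix, that is:

$$\max \; \det I(\theta). \tag{11}$$

When this function is optimized for a grid of points $(s, t)$, the resulting objective
function is:

$$\max \min_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \det I(\theta_{st}).$$

For D-optimality no simplifications are needed. In the two-dimensional case, the
objective function associated with D-optimality is equal to:

$$\max \min_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2. \tag{12}$$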
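For a fixed selection of items, both criteria reduce to a worst-case value over the grid of ability points. The sketch below evaluates the A-criterion (largest weighted variance sum) and the D-criterion (smallest determinant) for a hypothetical three-item bank; the item tuples `(a1, a2, d)`, the function names, and the 3 x 3 grid are assumptions made for illustration.

```python
import math
from itertools import product

def info_sums(items, x, theta):
    """Sums of a1^2 PQ x_i, a2^2 PQ x_i and a1 a2 PQ x_i at one ability point."""
    s11 = s22 = s12 = 0.0
    for (a1, a2, d), xi in zip(items, x):
        p = 1.0 / (1.0 + math.exp(-(a1 * theta[0] + a2 * theta[1] + d)))
        pq = p * (1.0 - p) * xi
        s11 += a1 * a1 * pq
        s22 += a2 * a2 * pq
        s12 += a1 * a2 * pq
    return s11, s22, s12

def a_criterion(items, x, grid, w1=1.0, w2=1.0):
    """Largest weighted sum of variances over the grid (to be minimized)."""
    worst = float("-inf")
    for theta in grid:
        s11, s22, s12 = info_sums(items, x, theta)
        worst = max(worst, (w1 * s22 + w2 * s11) / (s11 * s22 - s12 ** 2))
    return worst

def d_criterion(items, x, grid):
    """Smallest determinant of the information matrix over the grid (to be maximized)."""
    return min(s11 * s22 - s12 ** 2
               for s11, s22, s12 in (info_sums(items, x, th) for th in grid))

grid = list(product((-1.0, 0.0, 1.0), repeat=2))
items = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.8, 0.6, -0.3)]  # hypothetical
x = [1, 1, 0]
a_value = a_criterion(items, x, grid)
d_value = d_criterion(items, x, grid)
```

In this example the two selected items each measure a single dimension, so the worst case for both criteria occurs at the grid corners, where the item information is smallest.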



Model

Several kinds of constraints can be formulated. In van der Linden (1998), categorical
constraints, quantitative constraints, and constraints on inter-item dependencies are
distinguished. Categorical constraints deal with, for example, content classification,
gender orientation, or minority orientation. For quantitative constraints one could think
of response times or word count constraints. When items contain clues to each other they
are in an enemy set, and only one item from this set is allowed in a test.

Using the maximin approach (van der Linden & Boekkooi-Timminga, 1989), the
following mathematical programming problem for A-optimality has to be solved:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; w_1 \mathrm{Var}(\hat{\theta}_1 \mid \theta_{st}) + w_2 \mathrm{Var}(\hat{\theta}_2 \mid \theta_{st}), \tag{13}$$

subject to:

$$\sum_{i \in C} x_i \le n_C, \quad \text{(categorical constraints)} \tag{14}$$

$$\sum_{i \in Q} q_i x_i \le u_Q, \quad \text{(quantitative constraints)} \tag{15}$$

$$\sum_{i \in E} x_i \le 1, \quad \text{(enemy sets)} \tag{16}$$

$$\sum_{i=1}^{I} x_i = n, \quad \text{(test length)} \tag{17}$$

$$x_i \in \{0, 1\}, \quad i = 1, \ldots, I. \tag{18}$$

The parameters $n_C$ are the bounds that determine the number of items from the
subset $C$ to be in the test. Bounds for the quantitative constraints are denoted by $u_Q$.
Constraint 17 determines the test length and constraint 18 defines the decision variables.
For the criterion of D-optimality a similar model can be described; in that case the
objective function is defined by Equation 12.
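Constraints 14 to 18 are all linear in the decision variables, so checking whether a candidate test satisfies them is a matter of a few sums. The following sketch is illustrative only: the function name, the way constraints are passed in, and the five-item example are assumptions, and the quantitative attribute is given for every item (with zero for items outside the set Q).

```python
def is_feasible(x, n, categorical, quantitative, enemy_sets):
    """Check a 0/1 selection vector against constraints (14)-(18).

    categorical  : list of (index_set, upper_bound n_C)
    quantitative : list of (q, upper_bound u_Q), q holding one value per item
    enemy_sets   : list of index sets from which at most one item may be chosen
    """
    if any(xi not in (0, 1) for xi in x):            # (18) binary variables
        return False
    if sum(x) != n:                                  # (17) test length
        return False
    for subset, bound in categorical:                # (14)
        if sum(x[i] for i in subset) > bound:
            return False
    for q, bound in quantitative:                    # (15)
        if sum(qi * xi for qi, xi in zip(q, x)) > bound:
            return False
    for subset in enemy_sets:                        # (16)
        if sum(x[i] for i in subset) > 1:
            return False
    return True

# Hypothetical five-item bank
categorical = [({0, 1}, 1)]                  # at most one item from items 0 and 1
quantitative = [([30, 40, 20, 50, 10], 70)]  # e.g. total response time at most 70
enemy_sets = [{2, 3}]
ok = is_feasible([1, 0, 1, 0, 1], 3, categorical, quantitative, enemy_sets)
bad = is_feasible([1, 1, 1, 0, 0], 3, categorical, quantitative, enemy_sets)
```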



Algorithms

In this paper, a new algorithm, based on linear approximation of the objective
functions, is proposed. This algorithm can be applied to unconstrained test assembly
problems as well as to constrained ones. In order to evaluate the algorithm, two algorithms
were used as benchmarks. The first algorithm is a generalization of Luecht’s (1996)
greedy algorithm for multidimensional adaptive testing (MAT) to the P&P case. The second
algorithm randomly selects items from the item bank. Therefore, it is only applicable to
unconstrained test assembly problems.

Linear Approximation of the Objective Function

The algorithm is based on linear programming techniques. These techniques are
often applied in automated test assembly. However, they optimize a linear objective
function subject to a number of constraints on the test attributes. In order to apply these
techniques to the problem at hand, a linear approximation of the objective function in
Equation 13 should be made. The general formula for the linear approximation of a
function $f$ at a given point $\bar{x}$ is given by $f(\bar{x}) + \nabla f(\bar{x})'(x - \bar{x})$, where $\nabla f$ is the
vector of first-order derivatives (see, for example, Bazaraa, Sherali, and Shetty, 1993,
page 121).

When a linear approximation of the objective function is calculated, a point $\theta$ has
to be chosen where the function is approximated. Unfortunately, it is unclear in advance
which $\theta$-point to choose. Therefore, it was decided to optimize the worst performance of
the linear approximation of the objective functions over the grid points $(s, t)$. The linear
approximations of the simplified objective functions in Equation 10 and Equation 12 are
given in Appendix A.

In Appendix A it is shown that the approximations only differ in the values of $k_{jst}$.
So, the resulting test assembly problem for both the objective functions in Equation 10
and Equation 12 can be formulated as follows:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \sum_{i=1}^{I} k_{1st} a_{1i}^2 P_i Q_i x_i + \sum_{i=1}^{I} k_{2st} a_{2i}^2 P_i Q_i x_i + \sum_{i=1}^{I} k_{3st} a_{1i} a_{2i} P_i Q_i x_i \tag{19}$$

subject to:

$$\sum_{i \in C} x_i \le n_C, \quad \text{(categorical constraints)} \tag{20}$$

$$\sum_{i \in Q} q_i x_i \le u_Q, \quad \text{(quantitative constraints)} \tag{21}$$

$$\sum_{i \in E} x_i \le 1, \quad \text{(enemy sets)} \tag{22}$$

$$\sum_{i=1}^{I} x_i = n, \quad \text{(test length)} \tag{23}$$

$$x_i \in \{0, 1\}, \quad i = 1, \ldots, I, \tag{24}$$

where the coefficients $k_{jst}$ are determined by the linear approximation (see Appendix A).
Now the objective function is a linear function of the decision variables $x_i$, and general
automated test assembly techniques can be used to solve the problem.
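Because the linearized objective is a weighted sum of the decision variables, each item receives one coefficient per grid point. The sketch below computes these per-item coefficients for the D-criterion, using the partial derivatives of $xy - z^2$ evaluated at a reference test. It is an illustration only: the function names, the item tuples `(a1, a2, d)`, and the reference test are hypothetical, and the step of passing the coefficients to a 0-1 linear programming solver is not shown.

```python
import math

def item_terms(item, theta):
    """Contributions a1^2 PQ, a2^2 PQ and a1 a2 PQ of one item at one ability point."""
    a1, a2, d = item
    p = 1.0 / (1.0 + math.exp(-(a1 * theta[0] + a2 * theta[1] + d)))
    pq = p * (1.0 - p)
    return a1 * a1 * pq, a2 * a2 * pq, a1 * a2 * pq

def d_linear_coefficients(items, x_ref, theta):
    """Coefficient of each x_i in the linearized D-criterion at one grid point."""
    terms = [item_terms(item, theta) for item in items]
    # Sums for the reference test (the point at which the function is linearized)
    xbar = sum(t[0] * xi for t, xi in zip(terms, x_ref))
    ybar = sum(t[1] * xi for t, xi in zip(terms, x_ref))
    zbar = sum(t[2] * xi for t, xi in zip(terms, x_ref))
    # Partial derivatives of f(x, y, z) = x*y - z^2 at the reference point
    k1, k2, k3 = ybar, xbar, -2.0 * zbar
    return [k1 * t11 + k2 * t22 + k3 * t12 for t11, t22, t12 in terms]

items = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.8, 0.6, 0.0)]  # hypothetical
x_ref = [1, 1, 0]                                            # hypothetical reference test
coef = d_linear_coefficients(items, x_ref, (0.0, 0.0))
```

Repeating the computation for every grid point gives one linear function per point; the maximin problem then optimizes the worst of these functions over the grid.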

Greedy algorithm

For the assembly of adaptive tests that measure multiple abilities, different
approaches are at hand. In Segall (1996), a locally optimal item selection procedure for
MAT is described. Each time, the item is selected that provides the largest decrement in
the volume of the credibility ellipsoid. In van der Linden (1999) the item is selected that
minimizes the variance of a linear combination of the abilities. In Luecht (1996), Segall’s
approach is extended to the constrained case by building the total test content constraints
into the objective function. In these procedures the item is selected that contributes most
to the objective function in each iteration. Therefore they are in the class of the so-called
greedy algorithms.

The adaptive strategies can be applied rather straightforwardly in the context of
assembling tests. Items that contribute most to the objective function have to be
sequentially selected until the maximum number of items in the test is reached. In the
case of no constraints, this greedy heuristic selects the item for which the value of

$$\max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \frac{w_1 \left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) + w_2 \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2}$$

is minimal. When the objective function in Equation 12 is used, the heuristic selects
the item for which the value of

$$\min_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2$$

is maximal. In the case of constraints a different strategy should be used. All constraints
can be built into the objective function (Luecht, 1996), and the composite objective
function is optimized. When this strategy is applied to the problem in Equations 13 to 18,
the following problem has to be solved:

$$\min \; \sum_{j=1}^{4} \alpha_j d_j$$

subject to:

$$T - \min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \left[ w_1 \mathrm{Var}(\hat{\theta}_1 \mid \theta_{st}) + w_2 \mathrm{Var}(\hat{\theta}_2 \mid \theta_{st}) \right] = d_1,$$

$$\max\left(\sum_{i \in C} x_i - n_C, \; 0\right) = d_2,$$

$$\max\left(\sum_{i \in Q} q_i x_i - u_Q, \; 0\right) = d_3,$$

$$\max\left(\sum_{i \in E} x_i - 1, \; 0\right) = d_4,$$

$$\sum_{i=1}^{I} x_i = n,$$

$$x_i \in \{0, 1\},$$

where $T$ is a prespecified target for the objective function, $d_j$ is the deviation of the
$j$-th constraint, and $\alpha_j$ denotes the weight of the deviation of the $j$-th constraint. As in
the unconstrained case, in each iteration the item is added to the test that minimizes the
objective function. A target $T$ for the objective function can be obtained by solving the
unconstrained problem first.
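The unconstrained greedy step for the D-criterion can be sketched in a few lines. This is a simplified illustration rather than the program used in the study: constraints and the composite objective are left out, the items are hypothetical tuples `(a1, a2, d)`, and ties are broken by taking the first candidate. With a single item the determinant is always zero, so the first pick is effectively a tie-break.

```python
import math
from itertools import product

def d_value(items, selected, grid):
    """Smallest determinant of the information matrix over the grid."""
    worst = float("inf")
    for t1, t2 in grid:
        s11 = s22 = s12 = 0.0
        for i in selected:
            a1, a2, d = items[i]
            p = 1.0 / (1.0 + math.exp(-(a1 * t1 + a2 * t2 + d)))
            pq = p * (1.0 - p)
            s11 += a1 * a1 * pq
            s22 += a2 * a2 * pq
            s12 += a1 * a2 * pq
        worst = min(worst, s11 * s22 - s12 * s12)
    return worst

def greedy_d(items, n, grid):
    """Add, one at a time, the item that gives the largest worst-case determinant."""
    selected = []
    remaining = list(range(len(items)))
    while len(selected) < n and remaining:
        best = max(remaining, key=lambda i: d_value(items, selected + [i], grid))
        selected.append(best)
        remaining.remove(best)
    return selected

grid = list(product((-1.0, 0.0, 1.0), repeat=2))
# Hypothetical bank: two strong single-dimension items and two weak mixed items
items = [(1.5, 0.0, 0.0), (0.1, 0.1, 0.0), (0.0, 1.5, 0.0), (0.2, 0.2, 0.0)]
test_items = greedy_d(items, 2, grid)
```

In this small example the heuristic pairs the two strongly discriminating items, one per dimension, which is the selection that makes the information matrix far from singular at every grid point.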

Random Item Selection Algorithm

Items are randomly selected from the item bank until the maximum number of items
is reached. Since it is impossible to take constraints into account in this algorithm, it is
only applicable to unconstrained test assembly problems. The tests resulting from this
algorithm can be used as benchmarks. In order to be useful, a proposed algorithm should
produce tests that are at least as good as the tests provided by the random item selection
algorithm.
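A random benchmark test amounts to drawing a fixed number of distinct items from the bank. The sketch below is a minimal illustration; the function name and the seed are arbitrary choices, and the bank size of 176 matches the item pool used in the empirical examples.

```python
import random

def random_test(num_items, n, seed=None):
    """Return a 0/1 selection vector with exactly n randomly chosen items."""
    rng = random.Random(seed)
    chosen = set(rng.sample(range(num_items), n))
    return [1 if i in chosen else 0 for i in range(num_items)]

x_random = random_test(num_items=176, n=25, seed=1)
```

Averaging the criterion value over many such draws would give a more stable benchmark than a single random test.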



Empirical Examples

Tests were assembled from an ACT Assessment Program Mathematics item pool.
The item pool consisted of 176 items. The calibration of these items was carried
out with the program NOHARM (Fraser and McDonald, 1988), and an acceptable fit
was obtained by a two-dimensional version of the model in Equation 1. The items were
classified according to content specification and skill.

First, the problem without constraints was solved for the A-criterion and the
D-criterion. The linear approximation, the greedy heuristic, and the random selection
algorithm were applied. In this way the loss due to the linear approximation of the
objective function was examined for both optimality criteria. Second, the effects of
adding constraints to the model were investigated for the greedy heuristic and the linear
approximation. To this end, several sets of constraints were added to the problem (see
Table 1).

The main program for solving these problems was written in PASCAL 7.0, and
heuristic seven of the program ConTEST (Timminga, van der Linden and Schweizer, 1996)
was used to solve the linear programming parts of the examples.

Example 1

In the first example, tests were assembled over the complete grid of points defined
by $(\theta_1, \theta_2) \in \{-1, 0, 1\} \times \{-1, 0, 1\}$. The test had to contain a fixed number of items.
No further constraints on item or test attributes were defined. The number of items in
the test varied from 10 to 50. The results of a heuristic that randomly selected the fixed
number of items from the item pool were added to serve as a benchmark. In Figure 1 the
resulting values for the A-criterion objective function are shown.

Insert Figure 1 about here

As can be seen, solving the test assembly problem with a linear approximation of the
A-criterion objective function does not provide good tests. The approximation performed
hardly better than the random item selection algorithm. The greedy algorithm provided
much better results. For the D-criterion the results are shown in Figure 2. Recall that
the objective of the second criterion is to maximize the objective function instead of to
minimize it, so the higher the results, the better.

Insert Figure 2 about here

For this criterion, the linear approximation performed much better. The assembled
tests contained much more information than a random selection of items from the pool,
and almost as much information as the greedy test.

Example 2

For the D-criterion, the greedy algorithm and the linear approximation were
compared for a number of additional constraints. The following content and skill
constraints were added to the problem:

(1) The test should contain at least $n_{PG}$ plane geometry, $n_{PA}$ pre-algebra, $n_{EA}$
elementary algebra, $n_{CG}$ coordinate geometry, $n_{TG}$ trigonometry, and $n_{IA}$ intermediate
algebra items.

(2) At least $n_{BS}$ basic skill items, $n_{AP}$ application items, and $n_{AN}$ analysis items
should be included in the test.

Hence the following additional constraints were obtained:

$$\sum_{i \in V_{PG}} x_i \ge n_{PG}, \tag{25}$$

$$\sum_{i \in V_{PA}} x_i \ge n_{PA}, \tag{26}$$

$$\sum_{i \in V_{EA}} x_i \ge n_{EA}, \tag{27}$$

$$\sum_{i \in V_{CG}} x_i \ge n_{CG}, \tag{28}$$

$$\sum_{i \in V_{TG}} x_i \ge n_{TG}, \tag{29}$$

$$\sum_{i \in V_{IA}} x_i \ge n_{IA}, \tag{30}$$

$$\sum_{i \in V_{BS}} x_i \ge n_{BS}, \tag{31}$$

$$\sum_{i \in V_{AP}} x_i \ge n_{AP}, \tag{32}$$

$$\sum_{i \in V_{AN}} x_i \ge n_{AN}, \tag{33}$$

where, for example, $V_{PG}$ is the set of indices of the items with content classification Plane
Geometry (PG). Several sets of constraints were tested. Each violation of a constraint
was counted as a fault. The weights of all constraints in the greedy heuristic were set
equal to one. The lowest value of the D-criterion on the grid of $\theta$-points defined by
$\{-1, 0, 1\} \times \{-1, 0, 1\}$ and the number of faults are presented in Table 1. In this example
the test length was set equal to twenty-five.
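Counting faults for lower-bound constraints of this kind is a direct check of each sum against its minimum. The sketch below is illustrative only; the function name, the six-item bank, and the bounds are hypothetical and are not the constraint sets of Table 1.

```python
def count_faults(x, lower_bounds):
    """Number of violated lower-bound constraints for a 0/1 selection vector.

    lower_bounds : list of (index_set, minimum_number_of_items)
    """
    return sum(1 for subset, minimum in lower_bounds
               if sum(x[i] for i in subset) < minimum)

# Hypothetical bank of six items with three content categories
lower_bounds = [({0, 1, 2}, 2), ({3, 4}, 2), ({5}, 1)]
faults = count_faults([1, 1, 0, 0, 1, 0], lower_bounds)
```

Here the first category is satisfied and the other two are not, so two faults are counted.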



Insert Table 1 about here.

The greedy heuristic assembled tests with a higher value of the D-criterion. When the
constraints were easy to meet, that is, when enough items were present in the item pool,
no violations of constraints were made. However, when the constraints were hard to meet,
violations occurred when the greedy heuristic was used. For the sixth set of constraints,
as many as seven faults were counted.
Discussion

In this paper, a new algorithm for assembling tests from item pools that are calibrated
using a multidimensional IRT model was proposed. The performance of the algorithm
was compared with a greedy algorithm for both the A-optimality criterion and the
D-optimality criterion. In the first example both algorithms were compared without any
constraints in the model. For A-optimality the linear approximation did not prove to be
useful. For D-optimality good tests were obtained.

An explanation of these performances can be found in the formulas of both criteria.
When the formulas in Equation 10 and Equation 12 are compared, it is easy to see that
the first formula, a ratio of sums, is far more difficult to approximate by a linear function
than the second formula. Therefore, the linear approximation performed much worse for
the A-criterion than for the D-criterion. The conclusion can be drawn that the algorithm
based on a linear approximation of the objective function only performs well for simple
nonlinear functions. In both examples the items were calibrated with a two-dimensional
model. Since higher-dimensional models result in more difficult formulas, the question
of how this heuristic works out for higher-dimensional models needs additional research.

When the results of the second example are compared with the unconstrained case,
the differences between the two algorithms increase. In the unconstrained case, the
values for D-optimality were 2.13 for the greedy algorithm and 1.83 for the linear
approximation, a difference of 0.30. In the constrained cases the differences vary
from 0.67 to 0.95. An explanation can be found in the way the tests are assembled by
the two algorithms. Because the greedy algorithm allows violations of constraints, it is
less restricted by the constraints than the linear approximation. Therefore, the difference
increases. As a result, it is hard to compare the algorithms.

Which of the two algorithms should be applied in practice depends on the item pool
and the preferences of the test assembler. In large-scale testing programs, different
versions of a test should be comparable and it is important that all constraints are met.
When the item pool is well designed and contains enough items to fulfill all constraints,
the greedy algorithm should be applied. The first two sets of constraints in the second
example illustrate this case. No violations were made and the resulting tests were far more
informative than those of the linear approximation. On the other hand, when many
constraints are formulated for the test and the item pool is hardly able to fulfill them, the
greedy algorithm will result in many faults. For these cases the algorithm based on the
linear approximation of the objective function should be applied. What remains are the
cases where the greedy algorithm results in a few faults. For these cases no general
recommendations can be given and the preferences of the test assembler will be decisive.
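Appendix A.

For a test, the objective function for the A-optimality criterion is stated in Equation
10. It is formulated in the following manner:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \frac{w_1 \left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) + w_2 \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)}{\left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2},$$

where the sums are taken over the items in the test. Three terms are present in this function.
Define:

$$x = \sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i,$$

$$y = \sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i,$$

$$z = \sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i.$$

For a given ability point and a given test, the functions $x$, $y$, and $z$ can be
calculated. The result is denoted by $(\bar{x}, \bar{y}, \bar{z})$.

The objective function can be rewritten as:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; f(x, y, z),$$

where

$$f(x, y, z) = \frac{w_1 y + w_2 x}{xy - z^2}.$$

The linear approximation of the objective function in the point $(\bar{x}, \bar{y}, \bar{z})$ is equal to:

$$\min \max_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \frac{\partial f}{\partial x}(\bar{x}, \bar{y}, \bar{z}) \cdot x + \frac{\partial f}{\partial y}(\bar{x}, \bar{y}, \bar{z}) \cdot y + \frac{\partial f}{\partial z}(\bar{x}, \bar{y}, \bar{z}) \cdot z + c,$$

where $c = f(\bar{x}, \bar{y}, \bar{z}) - \nabla f(\bar{x}, \bar{y}, \bar{z})' \cdot (\bar{x}, \bar{y}, \bar{z})$, and the partial derivatives are given by:

$$k_{1st} = \frac{\partial f}{\partial x}(x_{st}, y_{st}, z_{st}) = \frac{w_2}{x_{st} y_{st} - z_{st}^2} - \frac{(w_1 y_{st} + w_2 x_{st}) \, y_{st}}{(x_{st} y_{st} - z_{st}^2)^2},$$

$$k_{2st} = \frac{\partial f}{\partial y}(x_{st}, y_{st}, z_{st}) = \frac{w_1}{x_{st} y_{st} - z_{st}^2} - \frac{(w_1 y_{st} + w_2 x_{st}) \, x_{st}}{(x_{st} y_{st} - z_{st}^2)^2},$$

$$k_{3st} = \frac{\partial f}{\partial z}(x_{st}, y_{st}, z_{st}) = \frac{2 z_{st} (w_1 y_{st} + w_2 x_{st})}{(x_{st} y_{st} - z_{st}^2)^2}.$$

Tests resulting from the greedy algorithm can be used to calculate $k_{1st}$, $k_{2st}$, and $k_{3st}$.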
For D-optimality the objective function is stated in Equation 12:

$$\max \min_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; \left(\sum_{i=1}^{I} a_{1i}^2 P_i Q_i x_i\right)\left(\sum_{i=1}^{I} a_{2i}^2 P_i Q_i x_i\right) - \left(\sum_{i=1}^{I} a_{1i} a_{2i} P_i Q_i x_i\right)^2.$$

In terms of $x$, $y$, and $z$ the function is equal to:

$$\max \min_{\substack{s=1,\ldots,S \\ t=1,\ldots,T}} \; xy - z^2.$$

The partial derivatives that define the linear approximation are given by:

$$k_{1st} = \frac{\partial f}{\partial x}(x_{st}, y_{st}, z_{st}) = y_{st},$$

$$k_{2st} = \frac{\partial f}{\partial y}(x_{st}, y_{st}, z_{st}) = x_{st},$$

$$k_{3st} = \frac{\partial f}{\partial z}(x_{st}, y_{st}, z_{st}) = -2 z_{st}.$$

The coefficients $k_{jst}$ are equal to the partial derivatives $\nabla f$ evaluated at the points
$(s, t)$ for a given reference test.
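For the A-criterion function $f(x, y, z) = (w_1 y + w_2 x)/(xy - z^2)$, the partial derivatives follow from the quotient rule and can be checked numerically. The sketch below compares the analytic gradient with central finite differences at an arbitrary, hypothetical point; the function names and the chosen point are illustrative only.

```python
def f(x, y, z, w1=1.0, w2=1.0):
    """A-criterion written in terms of the three information sums."""
    return (w1 * y + w2 * x) / (x * y - z * z)

def grad_f(x, y, z, w1=1.0, w2=1.0):
    """Analytic partial derivatives of f with respect to x, y and z."""
    det = x * y - z * z
    num = w1 * y + w2 * x
    return (w2 / det - num * y / det ** 2,
            w1 / det - num * x / det ** 2,
            2.0 * z * num / det ** 2)

def numeric_grad(x, y, z, h=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))

k1, k2, k3 = grad_f(2.0, 3.0, 1.0)
n1, n2, n3 = numeric_grad(2.0, 3.0, 1.0)
```

At the point (2, 3, 1) with unit weights, the determinant term is 5 and the numerator is 5, which gives the gradient (-0.4, -0.2, 0.4).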
Table 1.

Results of both algorithms for different sets of constraints.

Figure Captions

Figure 1: Results of different algorithms for the A-criterion.
Figure 2: Results of different algorithms for the D-criterion.

[Figures 1 and 2: horizontal axis, test length; plotted series, GREEDY, LINEAR, and RANDOM.]



Titles of Recent Research Reports from the Department of
Educational Measurement and Data Analysis.
University of Twente, Enschede, The Netherlands.

RR-00-04  B.P. Veldkamp, Constrained Multidimensional Test Assembly
RR-00-03  J.P. Fox & C.A.W. Glas, Bayesian Modeling of Measurement Error in Predictor Variables using Item Response Theory
RR-00-02  J.P. Fox, Stochastic EM for Estimating the Parameters of a Multilevel IRT Model
RR-00-01  E.M.L.A. van Krimpen-Stoop & R.R. Meijer, Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items
RR-99-08  W.J. van der Linden & J.E. Carlson, Calculating Balanced Incomplete Block Designs for Educational Assessments
RR-99-07  N.D. Verhelst & F. Kaftandjieva, A Rational Method to Determine Cutoff Scores
RR-99-06  G. van Engelenburg, Statistical Analysis for the Solomon Four-Group Design
RR-99-05  E.M.L.A. van Krimpen-Stoop & R.R. Meijer, CUSUM-Based Person-Fit Statistics for Adaptive Testing
RR-99-04  H.J. Vos, A Minimax Procedure in the Context of Sequential Mastery Testing
RR-99-03  B.P. Veldkamp & W.J. van der Linden, Designing Item Pools for Computerized Adaptive Testing
RR-99-02  W.J. van der Linden, Adaptive Testing with Equated Number-Correct Scoring
RR-99-01  R.R. Meijer & K. Sijtsma, A Review of Methods for Evaluating the Fit of Item Score Patterns on a Test
RR-98-16  J.P. Fox & C.A.W. Glas, Multi-level IRT with Measurement Error in the Predictor Variables
RR-98-15  C.A.W. Glas & H.J. Vos, Adaptive Mastery Testing Using the Rasch Model and Bayesian Sequential Decision Theory
RR-98-14  A.A. Béguin & C.A.W. Glas, MCMC Estimation of Multidimensional IRT Models
RR-98-13  E.M.L.A. van Krimpen-Stoop & R.R. Meijer, Person Fit based on Statistical Process Control in an Adaptive Testing Environment
RR-98-12  W.J. van der Linden, Optimal Assembly of Tests with Item Sets
RR-98-11  W.J. van der Linden, B.P. Veldkamp & L.M. Reese, An Integer Programming Approach to Item Pool Design
RR-98-10  W.J. van der Linden, A Discussion of Some Methodological Issues in International Assessments
RR-98-09  B.P. Veldkamp, Multiple Objective Test Assembly Problems
RR-98-08  B.P. Veldkamp, Multidimensional Test Assembly Based on Lagrangian Relaxation Techniques
RR-98-07  W.J. van der Linden & C.A.W. Glas, Capitalization on Item Calibration Error in Adaptive Testing
RR-98-06  W.J. van der Linden, D.J. Scrams & D.L. Schnipke, Using Response-Time Constraints in Item Selection to Control for Differential Speededness in Computerized Adaptive Testing
RR-98-05  W.J. van der Linden, Optimal Assembly of Educational and Psychological Tests, with a Bibliography
RR-98-04  C.A.W. Glas, Modification Indices for the 2-PL and the Nominal Response Model
RR-98-03  C.A.W. Glas, Quality Control of On-line Calibration in Computerized Assessment
RR-98-02  R.R. Meijer & E.M.L.A. van Krimpen-Stoop, Simulating the Null Distribution of Person-Fit Statistics for Conventional and Adaptive Tests
RR-98-01  C.A.W. Glas, R.R. Meijer, E.M.L.A. van Krimpen-Stoop, Statistical Tests for Person Misfit in Computerized Adaptive Testing
RR-97-07  H.J. Vos, A Minimax Sequential Procedure in the Context of Computerized Adaptive Mastery Testing
RR-97-06  H.J. Vos, Applications of Bayesian Decision Theory to Sequential Mastery Testing
RR-97-05  W.J. van der Linden & Richard M. Luecht, Observed-Score Equating as a Test Assembly Problem
RR-97-04  W.J. van der Linden & J.J. Adema, Simultaneous Assembly of Multiple Test Forms
RR-97-03  W.J. van der Linden, Multidimensional Adaptive Testing with a Minimum Error-Variance Criterion
RR-97-02  W.J. van der Linden, A Procedure for Empirical Initialization of Adaptive Testing Algorithms
RR-97-01  W.J. van der Linden & Lynda M. Reese, A Model for Optimal Constrained Adaptive Testing
RR-96-04  C.A.W. Glas & A.A. Béguin, Appropriateness of IRT Observed Score Equating
RR-96-03  C.A.W. Glas, Testing the Generalized Partial Credit Model

Research Reports can be obtained at cost from the Faculty of Educational Science and Technology,
University of Twente, TO/OMD, P.O. Box 217, 7500 AE Enschede, The Netherlands.



A publication by
The Faculty of Educational Science and Technology of the University of Twente
P.O. Box 217
7500 AE Enschede
The Netherlands




